2.快速开始-单机hbase

手册描述了运行在本地文件系统的单机版hbase的例子,这个配置并不适用于hbase生产环境,但是可以让你可以用hbase进行试验,这一小节教你如何使用hbaseshell客户端在hbase上创建一张表,并在表中插入一列,进行put/scan操作,使表有效、无效,开启或停止hbase 在本地文件系统上使用hbase并不能保证持久性。如果文件没有正常关闭,HDFS本地文件系统将失去编辑。当你在练习新的软件、开启或关闭服务时,这很有可能发生,你需要在hdfs上启动hbase以确保所有写入都被保存。在本地上运行是让你熟悉基本系统运行的捷径。

2.2. 开始hbase

procedure: 下载、配置并开启hbase

1.从apache download mirrors下载镜像

2.解压

$ tar xzvf hbase-2.0.0-SNAPSHOT-bin.tar.gz

$ cd hbase-2.0.0-SNAPSHOT/

3.HBase 0.98.5之后,在开始hbase之前,你需要设置环境变量JAVA_HOME,hbase提供了一个central mechanism 编辑conf/hbase-env.sh,去掉JAVA_HOME这行之前的批注#,根据你的系统设定好,bin/java

4.conf/hbase-site.xml是hbase的主要配置文件,当前你只要指定HBASE和zk写数据的本地目录。默认情况下,会在/tmp下创建一个新的目录 Many servers are configured to delete the contents of /tmp upon reboot, so you should store the data elsewhere.

  1. <configuration>
  2. <property>
  3. <name>hbase.rootdir</name>
  4. <value>file:///home/testuser/hbase</value>
  5. </property>
  6. <property>
  7. <name>hbase.zookeeper.property.dataDir</name>
  8. <value>/home/testuser/zookeeper</value>
  9. </property>
  10. </configuration>

不需要创建hbase数据目录,hbase会去创建,如果你创建了,hbase会去进行一个迁移、复制,而你并不想

5.bin/start-hbase.sh 提供了启动hbase的方便的方法,执行它,jps来验证你有个hmaster的进程。在单机模式下hbase的所有服务都运行在一个jvm上 hmaster hregionserver zk

Procedure: Use HBase For the First Time

1.连接

使用hbase shell连接你正在运行的hbase

2.hbase help

注意:表名(table name)、rows columns 都在引号中

3.创建表

使用create命令来创建表,你必须指定表名和ColumnFamily名 create ‘test’, ‘cf’

4.list命令

list ‘test’

5.向表中传入数据 put

hbase(main):003:0> put ‘test’, ‘row1’, ‘cf:a’, ‘value1’

这里我们插入3个值,第一个插入在row1, column cf:a,值value1.hbase中的列是由一个列族前缀cf跟着一个冒号,然后是column qualifier后缀a

6.scan

7.获取单独一行数据get ‘test’, ‘row1’

8.disable a table

9.删掉drop

10.退出

Procedure: Stop HBase

bin/stop-hbase.sh

2.3.伪分布式本地安装

伪分布式是指HBase依然在一个主机上运行,但hbase服务(Hmaster/hreginserver/zk)run as a separate process

1.关闭hbase

2.配置

hbase-site.xml
hbase.cluster.distributed true

这项配置代表hbase在distributed mode上运行,下一步,更改hbase.rootdir从本地文件系统到你的hdfs上,使用hdfs://// URI符号 You do not need to create the directory in HDFS. HBase will do this for you. If you create the directory, HBase will attempt to do a migration, which is not what you want.

3.开启hbase

bin/start-hbase.sh 如果配置正确,jps显示 HMaster and HRegionServer processes running

4.检查hdfs中hbase目录

hadoop fs -ls /hbase

5.创建表并用数据填充(populate)

6.启动停止一个备份的hbase主节点。

HMaster server controls the HBase cluster你可以启动9个备份hmaster节点,这样算上主的,总共有10个hmasters To start a backup HMaster, use the local-master-backup.sh 对每个你想要的备份主节点,给他增加一个代表端口偏移的参数。每个HMaster使用3个端口(默认:16010,16020,16030),所以下面的命令启动3个备份节点使用端口16012/16022/16032, 16013/16023/16033, and 16015/16025/16035

local-master-backup.sh 2 3 5 要kill一个备份master而不关闭整个集群,首先要找到进程id(PID) ,在/tmp/hbase-USER-X-master.pid中,

cat /tmp/hbase-testuser-1-master.pid |xargs kill -9

7.Start and stop additional RegionServers

HRegionServer管理数据,一般来讲一个HRegionSerever管理集群中的一个节点,运行多个HRS在同一个系统中,可以用来测试伪分布式 local-regionservers.sh It works in a similar way to the local-master-backup.sh command, in that each parameter you provide represents the port offset for an instance.每个RS需要两个端口,默认端口是16020和16030,然而增加的RegionServer基本端口不是默认端口,因为默认端口被HMASTER使用,这个端口自hbase1.0开始也作为RegionServer,基本的端口用16200和16300代替。我们可以在一台服务器上运行99个额外的RegionServers,且可以这台服务器不是HMaster或备份HMaster。下面的命令在连续的端口(16202/16302)上启动四个增加的RS

$ .bin/local-regionservers.sh start 2 3 4 5

8.stop hbase

2.4. Advanced - Fully Distributed

现实中,我们需要一个分布式的配置去完整的测试HBase并在真实的场景中使用它。在分布式配置中,集群包含多个节点,每个运行一个或多个HBase服务。包含主备MASTER,多个ZK节点,多个RegionSever节点。本文增加两个额外的节点到集群中,具体结构

Procedure: Configure Passwordless SSH Access

1.node-a:$ ssh-keygen -t rsa 2.on node-b/c:create a .ssh/ directory in the user’s home directory 3.Copy the public key to the other nodes. On each of the other nodes, create a new file called .ssh/authorized_keys if it does not already exist, and append the contents of the id_rsa.pub file to the end of it. Note that you also need to do this for node-a itself.

$ cat id_rsa.pub >> ~/.ssh/authorized_keys

  1. Test password-less login.

Procedure: Prepare node-a

node-a将运行primary master和zk,但没有RS,所以停止node-a上的RS

1.Edit conf/regionservers and remove the line which contains localhost. Add lines with the hostnames or IP addresses for node-b and node-c.

即使你特别想运行RS在node-a,你应该用hostname来包含它,其他服务器将用来与它通信

2.配置HBase在node-b上做为backup master

Create a new file in conf/ called backup-masters, and add a new line to it with the hostname for node-b. In this demonstration, the hostname is node-b.example.com.

3.配置ZK

现实中,一定要认真考虑ZK的配置。这个配置将direct(指导?)hbase启动和管理一个ZK实例

On node-a, edit conf/hbase-site.xml and add the following properties.

hbase.zookeeper.quorum node-a.example.com,node-b.example.com,node-c.example.com hbase.zookeeper.property.dataDir /usr/local/zookeeper

Procedure: Prepare node-b and node-c

node-b will run a backup master server and a ZooKeeper instance.

1.Download and unpack HBase.

Download and unpack HBase to node-b, just as you did for the standalone and pseudo-distributed quickstarts.

2.Copy the configuration files from node-a to node-b.and node-c.

Each node of your cluster needs to have the same configuration information. Copy the contents of the conf/ directory to the conf/ directory on node-b and node-c.

Procedure: Start and Test Your Cluster

1.Be sure HBase is not running on any node.

2.Start the cluster.

On node-a, issue the start-hbase.sh command. Your output will be similar to that below. ZK先起来,然后是master,Rs最后是backup masters

ZooKeeper Process Name The HQuorumPeer process is a ZooKeeper instance which is controlled and started by HBase.

Web UI Port Changes hbase0.98x以后版本,HTTP端口(hbase web ui)从60010(master)60030(each RS)变为 16010(master)16030(RS)